On modeling and tolerating incorrect software

Authors

  • Anish Arora
  • Marvin Theimer
Abstract

Distributed systems have to deal with the following scenarios in practice: bugs in components; incorrect specifications of components and, therefore, incorrect use of components; unanticipated faults due to complex interactions or to not containing the effects of faults in lower-level components; and evolution of components. Extant fault tolerance models deal with such scenarios in only a limited manner. In particular, we point out that state corruption is inevitable in practice and that therefore one must accept it and seek to correct it. The well-known concepts of detectors and correctors can be used to find and repair state corruption. However, these concepts have traditionally been employed to immediately detect and correct errors caused by misbehaving system components. Immediate detection and correction is often too expensive to perform and hence we consider the implications of running detectors and correctors only intermittently. More specifically, we address issues that must be dealt with when state corruption may persist within a system for a period of time. We show how to both detect and correct state corruption caused by infrequently occurring “transient” errors despite the ability for it to actively spread to other parts of the system. We also show how to eventually detect all state corruption, even in cases where continually recurring errors are constantly introducing new state corruption. Finally, we discuss the minimum set of capabilities needed from a trusted base of software in order to guarantee the correctness of our algorithms.
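
The idea of running detectors and correctors only intermittently can be illustrated with a minimal sketch. This is not the paper's algorithm: the state layout, the invariant `total == a + b`, and the names `detector`, `corrector`, and `run` are all assumptions made up for illustration. The sketch injects one transient error and records how long the corruption persists before the next scheduled check repairs it.

```python
from typing import Dict, List

State = Dict[str, int]


def detector(state: State) -> bool:
    """Hypothetical detector: True if the invariant total == a + b is violated."""
    return state["total"] != state["a"] + state["b"]


def corrector(state: State) -> None:
    """Hypothetical corrector: restore the invariant by recomputing the derived field."""
    state["total"] = state["a"] + state["b"]


def run(steps: int, check_every: int, corrupt_at: int) -> List[int]:
    """Simulate a component and return the steps that end with corrupted state."""
    state: State = {"a": 0, "b": 0, "total": 0}
    corrupted_steps: List[int] = []
    for step in range(1, steps + 1):
        # Normal operation keeps the invariant.
        state["a"] += 1
        state["total"] += 1
        # A single transient error corrupts the derived field.
        if step == corrupt_at:
            state["total"] += 100
        # Detection and correction run only every `check_every` steps.
        if step % check_every == 0 and detector(state):
            corrector(state)
        # Record whether corruption is still present at the end of this step.
        if detector(state):
            corrupted_steps.append(step)
    return corrupted_steps
```

Under these assumptions, checking on every step leaves no step corrupted, while checking every fourth step lets an error injected at step 2 persist through steps 2 and 3 until the check at step 4 repairs it. The sketch does not model corruption spreading to other components or continually recurring errors, both of which the abstract addresses.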

Similar resources

Software Dependability in the Tandem GUARDIAN System

Abstract. Based on extensive field failure data for Tandem's GUARDIAN operating system, this paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process ...

Recommending Auto-completions for Software Modeling Activities

Abstract. Auto-completion of textual inputs benefits software developers using IDEs and editors. However, graphical modeling tools used to design software do not provide this functionality. The challenges of recommending auto-completions for graphical modeling activities are largely unexplored. Recommending auto-completions during modeling requires detecting meaningful partly completed activiti...

Bridging the Gap between Hardware and Software Fault Tolerance

During the last decades several mechanisms for tolerating errors caused by software (design) faults have been put forward. Unfortunately only few experimental programming languages have incorporated them, so these schemes are not available in programming languages and systems that are used in developing modern applications. This is why programmers must either implement these mechanisms themselv...

Relative Performance of Hardware and Software-Only Directory Protocols Under Latency Tolerating and Reducing Techniques

In both hardware-only and software-only directory protocols the performance is often limited by memory access stall times. To increase the performance, several latency tolerating and reducing techniques have been proposed and shown effective for hardware-only directory protocols. For software-only directory protocols, the efficiency of a technique depends not only on how effective it is as seen...

Software Change Contracts

Software errors often originate from incorrect changes, including incorrect program fixes, incorrect feature updates and so on. Capturing the intended program behavior explicitly via contracts is thus an attractive proposition. In our recent work, we had espoused the notion of “change contracts” to express the intended program behavior changes across program versions. Change contracts differ fr...

Journal:
  • J. High Speed Networks

Volume: 14

Publication date: 2005